A colleague sampled the occurrence (0 or 1) of Canada lynx at 75 grid cells using camera traps. They would like your help to 1) fit a model that estimates the occurrence probability of lynx and investigates the influence of two variables, and 2) use these estimates to inform future study design scenarios.
A colleague used camera traps to sample whether a lynx was present
(1) or assumed absent (0) at each grid cell or ‘site’ (y)
during the winter (December to February); we will assume there are no
false positives or false negatives in these data. They designed the
sampling and site selection such that they had variation in two important
covariates: the distance the camera was from a road
(dist.road) and the percentage of forest cover
(cover). Their hypothesis is that lynx will avoid human
activity by occurring further from roads when they are not under cover,
but will occur near roads that are under cover as they are able to
remain hidden.
Fit the data (lynx.data.csv) using a model that captures
the hypothesis of your colleague.
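One plausible way to write such a model is a logistic regression with an interaction between the two covariates. This is a sketch under assumptions, not the required answer: the symbols \beta_0 to \beta_3 and p_i are notation introduced here, and the expected signs are only one reading of the hypothesis.

```latex
y_i \sim \mathrm{Bernoulli}(p_i), \qquad i = 1, \dots, 75
\\
\mathrm{logit}(p_i) = \beta_0
  + \beta_1\,\text{dist.road}_i
  + \beta_2\,\text{cover}_i
  + \beta_3\,(\text{dist.road}_i \times \text{cover}_i)
```

Under this parameterization, the effect of distance to road on the logit scale is \beta_1 + \beta_3\,\text{cover}_i. If the hypothesis holds, one would expect \beta_1 > 0 (occurrence increases with distance when there is no cover) and \beta_3 < 0 (that effect weakens or reverses as cover increases).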
Your colleague would like these data to inform them on whether they are sampling enough grid cells/sites. They specifically want to know whether the sample size provides enough statistical power to reject the null hypothesis that each estimated coefficient is no different from zero, at a type I error rate of 0.05. The power they want to achieve is 0.80. Note: in a simulation context, think of obtaining the sampling distribution of the p-value for each coefficient and evaluating what proportion of those p-values falls below the type I error rate; that proportion is the estimated power.
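The simulation-based power estimate described here can be written compactly. The notation is introduced for illustration only: j indexes the model coefficients, S is the number of simulated data sets, p_j^{(s)} is the p-value for coefficient j in simulated data set s, and \alpha is the type I error rate.

```latex
\widehat{\mathrm{power}}_j
  = \frac{1}{S} \sum_{s=1}^{S} \mathbb{1}\!\left( p_j^{(s)} < \alpha \right),
\qquad \alpha = 0.05
```

The sample size is judged adequate for coefficient j when \widehat{\mathrm{power}}_j \ge 0.80.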
Use the estimated coefficients to simulate many data sets (>1000). Fit each data set using the same model that you used to fit the empirical data. Extract the p-value of each coefficient to evaluate whether there is adequate statistical power according to your colleague's target. If there is not adequate power, consider increasing the sample size.
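The steps above can be sketched in Python as follows. This is a minimal illustration under several loudly stated assumptions, not the intended solution: the coefficient values in `beta_hypothetical` are made up (in practice they would be replaced by the estimates from the fit to lynx.data.csv), the covariates are drawn as standardized normal variables rather than taken from the observed sites, the model is assumed to be a logistic regression with a dist.road by cover interaction, and the fit uses a small hand-written Newton-Raphson routine with Wald p-values in place of a GLM routine from a statistics package. The function names `fit_logistic` and `simulate_power` are hypothetical.

```python
import math
import numpy as np


def fit_logistic(X, y, max_iter=25, tol=1e-8):
    """Fit a logistic regression by Newton-Raphson.

    Returns (beta, two-sided Wald p-values), or None if the fit fails
    to converge (for example, when the simulated data are separated).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(X @ beta, -30, 30)))
        info = X.T @ (X * (p * (1.0 - p))[:, None])  # Fisher information
        try:
            cov = np.linalg.inv(info)
        except np.linalg.LinAlgError:
            return None
        step = cov @ (X.T @ (y - p))
        beta = beta + step
        if not np.all(np.isfinite(beta)):
            return None
        if np.max(np.abs(step)) < tol:
            break
    else:
        return None  # ran out of iterations without converging
    z = beta / np.sqrt(np.diag(cov))
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_values = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    return beta, p_values


def simulate_power(beta_true, n_sites, n_sims=1000, alpha=0.05, seed=1):
    """Estimate power per coefficient as the share of p-values below alpha."""
    beta_true = np.asarray(beta_true, dtype=float)
    rng = np.random.default_rng(seed)
    rejections = np.zeros(len(beta_true))
    n_fitted = 0
    for _ in range(n_sims):
        # ASSUMPTION: standardized covariates, redrawn for every data set.
        dist_road = rng.normal(size=n_sites)
        cover = rng.normal(size=n_sites)
        X = np.column_stack(
            [np.ones(n_sites), dist_road, cover, dist_road * cover]
        )
        prob = 1.0 / (1.0 + np.exp(-(X @ beta_true)))
        y = rng.binomial(1, prob)
        fit = fit_logistic(X, y)
        if fit is None:
            continue  # skip data sets where the model could not be fitted
        n_fitted += 1
        rejections += fit[1] < alpha
    return rejections / max(n_fitted, 1), n_fitted


# ASSUMPTION: placeholder coefficients (intercept, dist.road, cover,
# interaction); replace with the estimates from the empirical fit.
beta_hypothetical = [-0.5, 1.0, 0.5, -1.0]

for n in (75, 150, 300):
    power, n_ok = simulate_power(beta_hypothetical, n)
    print(f"n = {n}: fitted {n_ok} data sets, power = {np.round(power, 2)}")
```

Each printed vector gives the estimated power for the intercept, dist.road, cover, and interaction terms, in that order. Comparing the values against 0.80 across the candidate sample sizes shows which coefficients, if any, need more sites. Data sets that fail to converge are dropped here and their count is reported, which is one of several reasonable ways to handle them.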